Objective:¶
In this lab, the objective is to develop and evaluate models that classify images of dogs and cats effectively, both by designing a neural network and using transfer learning with a pre-trained model, VGG16. This exercise will help in understanding how to build and fine-tune deep learning models on a specific task with proper evaluation and insight into the performance of the model.¶
Importing Libraries¶
import os, shutil, pathlib
from tensorflow import keras
from tensorflow.keras import layers
import pathlib
from tensorflow.keras.utils import image_dataset_from_directory
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from PIL import Image
from pathlib import Path
from sklearn.metrics import confusion_matrix, classification_report, precision_recall_curve, auc, ConfusionMatrixDisplay
Obtaining the Data¶
The dataset is taken from: Kaggle - Dogs versus Cats¶
1. Loading Data¶
data_folder = pathlib.Path("C:\\Users\\Tejaswini Marolia\\Downloads\\kaggle_dogs_vs_cats_small")
train_dataset = image_dataset_from_directory(
data_folder / "train",
image_size=(180, 180),
batch_size=32)
validation_dataset = image_dataset_from_directory(
data_folder / "validation",
image_size=(180, 180),
batch_size=32)
test_dataset = image_dataset_from_directory(
data_folder / "test",
image_size=(180, 180),
batch_size=32)
Found 2000 files belonging to 2 classes. Found 1000 files belonging to 2 classes. Found 2000 files belonging to 2 classes.
# Count images in each class
train_dir = data_folder / "train"
class_counts = {folder.name: len(list(folder.glob("*.jpg"))) for folder in train_dir.iterdir()}
print("Class counts:", class_counts)
Class counts: {'cat': 1000, 'dog': 1000}
2. Visualizing data¶
# Visualize class distribution
sns.barplot(x=list(class_counts.keys()), y=list(class_counts.values()))
plt.title("Class Distribution in Training Data")
plt.xlabel("Class")
plt.ylabel("Number of Images")
plt.show()
This image shows the class distribution of the training dataset in a bar chart manner.
Classes: Classes include "cat" and "dog.".
Balanced: The number of images in both classes is equal, which is reflected in the bars being of equal height.
Balanced Dataset: The dataset will be considered balanced, as the number of samples for each class is equal, which is important in training machine learning models to avoid bias toward one class.
Number of Images: According to the graph, there are about 1000 images per class in the training dataset.
3.Analyzing and Visualizing Image Dimension Distributions¶
# Analyze image dimensions
image_shapes = []
for img_path in train_dir.glob("*/*.jpg"):
with Image.open(img_path) as img:
image_shapes.append(img.size)
# Convert to numpy array
image_shapes = np.array(image_shapes)
widths, heights = image_shapes[:, 0], image_shapes[:, 1]
# Plot distributions of width and height
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
sns.histplot(widths, kde=True, color='blue')
plt.title("Width Distribution")
plt.xlabel("Width (pixels)")
plt.subplot(1, 2, 2)
sns.histplot(heights, kde=True, color='green')
plt.title("Height Distribution")
plt.xlabel("Height (pixels)")
plt.show()
Width Distribution:¶
- The Width Distribution plot (left) indicates that most of the images are clustered around 500 pixels, with a strong peak at this value.
- A number of images have smaller widths below 500 pixels, suggesting variation in the size of images within the dataset.
- There is a smooth curve, or kernel density estimation, overlaying the histogram to show the overall trend, confirming strong central tendency around the value 500 pixels.
Distribution of Heights:¶
- The Height Distribution plot on the right shows two major peaks: one at approximately 400 pixels and another around 500 pixels. This might indicate that the data may be images from two sets of height ranges, such as images with different aspect ratios or from two sources.
- As with widths, there are some smaller height values, but it looks like most of the data falls in these two peaks.
Overall Observations:¶
- The dataset contains some uniformity in image sizes but also displays variability, especially in the height dimension. It is bimodal.
- This may require preprocessing of the images, such as resizing them to one common dimension, depending on your task, for consistency in input size to models.
4. Displaying Sample images of Dogs and Cats¶
def show_sample_images(data_path, class_name, num_images=5):
# Convert data_path to a Path object if it's not already
data_path = Path(data_path)
class_path = data_path / class_name
sample_images = list(class_path.glob("*.jpg"))[:num_images]
plt.figure(figsize=(15, 5))
for i, img_path in enumerate(sample_images):
img = Image.open(img_path)
plt.subplot(1, num_images, i + 1)
plt.imshow(img)
plt.axis('off')
plt.title(class_name.capitalize())
plt.show()
# Display sample images
show_sample_images("C:\\Users\\Tejaswini Marolia\\Downloads\\kaggle_dogs_vs_cats_small\\train","dog",6)
show_sample_images("C:\\Users\\Tejaswini Marolia\\Downloads\\kaggle_dogs_vs_cats_small\\train","cat", 6)
Above are sample images from the "Dogs vs Cats" dataset, from the two classes: "dog" and "cat." Each set contains 5 randomly selected images from the specified directory. These samples show the visual diversity of the dataset, ranging from different breeds to poses and lighting conditions. In the "dog" class, each image shows a different dog in a unique setting, while the "cat" class features a variety of cats. This exploration emphasizes the dataset's variability and potential challenges, such as differing angles, lighting, and noise-including blurry or partially obscured images-factors crucial to consider when training a classification model.
Convolutional Neural Network (CNN)¶
inputs = keras.Input(shape=(180, 180, 3))
x = layers.Rescaling(1./255)(inputs)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model_cnn = keras.Model(inputs=inputs, outputs=outputs)
model_cnn.summary()
Model: "functional"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ input_layer (InputLayer) │ (None, 180, 180, 3) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ rescaling (Rescaling) │ (None, 180, 180, 3) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d (Conv2D) │ (None, 178, 178, 32) │ 896 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d (MaxPooling2D) │ (None, 89, 89, 32) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_1 (Conv2D) │ (None, 87, 87, 64) │ 18,496 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_1 (MaxPooling2D) │ (None, 43, 43, 64) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_2 (Conv2D) │ (None, 41, 41, 128) │ 73,856 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_2 (MaxPooling2D) │ (None, 20, 20, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_3 (Conv2D) │ (None, 18, 18, 256) │ 295,168 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_3 (MaxPooling2D) │ (None, 9, 9, 256) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_4 (Conv2D) │ (None, 7, 7, 256) │ 590,080 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten (Flatten) │ (None, 12544) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense (Dense) │ (None, 1) │ 12,545 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 991,041 (3.78 MB)
Trainable params: 991,041 (3.78 MB)
Non-trainable params: 0 (0.00 B)
CNN Model Summary
The code given below is for a CNN binary image classifier using the Keras API from TensorFlow. Below is an overview:
Input Layer:
- Image size: 180x180, with 3 color channels (RGB).
Data Normalization:
- All pixel values are scaled between the range [0, 1] through
Rescaling(1./255).
- All pixel values are scaled between the range [0, 1] through
**Feature Extraction:
- Five
Conv2Dlayers are utilized to extract features from the images with increasing filters (32, 64, 128, and 256), each using ReLU activation.MaxPooling2Dlayers are followed by each convolutional layer to downsample feature maps and reduce computational complexity.
Flattening:
- A
Flattenlayer converts the multi-dimensional feature maps into a 1D vector.
- A
Output Layer:
- A
Denselayer with a sigmoid activation outputs a probability for binary classification.
- Model Creation:
- Combine all layers into one complete model that is ready for training and prediction.
The model is designed to solve the classic problem of binary classification (dog vs. cat) using convolutional operations to learn features and classify the images.
Compling the Model¶
model_cnn.compile(loss="binary_crossentropy",
optimizer="rmsprop",
metrics=["accuracy"])
The CNN model is compiled for training that has the following key ingredients, in particular:
Loss Function: Here the loss function of choice would be the binary cross-entropy-the so-called binary classification where an output or response needs to be of only one class among a number of two classes, for instance: a dog versus a cat.
Optimizer: RMSprop is selected as the optimizer, an adaptive learning rate method that helps adjust the model's weights during training. It scales the learning rate based on the moving average of recent gradients, making it effective for training deep learning models.
Metrics: The model assesses the performance by accuracy, which is a measure of the percentage of correct predictions. This metric is usually used in classification tasks to monitor the progress of the model during training.
In essence, this code prepares the CNN for binary classification by specifying the loss function, optimization strategy, and evaluation metric, readying the model for training on data.
Training CNN Model with Checkpoint Callbacks for Best Validation Loss:¶
callbacks = [
keras.callbacks.ModelCheckpoint(
filepath="./models/convnet_from_scratch.keras",
save_best_only=True,
monitor="val_loss")
]
history = model_cnn.fit(
train_dataset,
epochs=30,
validation_data=validation_dataset,
callbacks=callbacks)
Epoch 1/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 97s 2s/step - accuracy: 0.5026 - loss: 1.1999 - val_accuracy: 0.5030 - val_loss: 0.6920 Epoch 2/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 79s 1s/step - accuracy: 0.5122 - loss: 0.6971 - val_accuracy: 0.5000 - val_loss: 0.7002 Epoch 3/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 79s 1s/step - accuracy: 0.5150 - loss: 0.6987 - val_accuracy: 0.5090 - val_loss: 0.6878 Epoch 4/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 84s 1s/step - accuracy: 0.5631 - loss: 0.6878 - val_accuracy: 0.6510 - val_loss: 0.6565 Epoch 5/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 86s 1s/step - accuracy: 0.6136 - loss: 0.6596 - val_accuracy: 0.6180 - val_loss: 0.6485 Epoch 6/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 88s 1s/step - accuracy: 0.6411 - loss: 0.6304 - val_accuracy: 0.5990 - val_loss: 0.6706 Epoch 7/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 91s 1s/step - accuracy: 0.6808 - loss: 0.6025 - val_accuracy: 0.7000 - val_loss: 0.5759 Epoch 8/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 88s 1s/step - accuracy: 0.7176 - loss: 0.5586 - val_accuracy: 0.7020 - val_loss: 0.5581 Epoch 9/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 86s 1s/step - accuracy: 0.7410 - loss: 0.5360 - val_accuracy: 0.6660 - val_loss: 0.6160 Epoch 10/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 98s 2s/step - accuracy: 0.7744 - loss: 0.4852 - val_accuracy: 0.6410 - val_loss: 0.9055 Epoch 11/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 89s 1s/step - accuracy: 0.8117 - loss: 0.4616 - val_accuracy: 0.7060 - val_loss: 0.5771 Epoch 12/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 90s 1s/step - accuracy: 0.8086 - loss: 0.3913 - val_accuracy: 0.7270 - val_loss: 0.6780 Epoch 13/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 89s 1s/step - accuracy: 0.8735 - loss: 0.3377 - val_accuracy: 0.7350 - val_loss: 0.6595 Epoch 14/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 89s 1s/step - accuracy: 0.8768 - loss: 0.2920 - val_accuracy: 0.6970 - val_loss: 0.7668 Epoch 15/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 89s 1s/step - accuracy: 0.9145 - loss: 0.2215 - val_accuracy: 0.7090 - val_loss: 0.8277 Epoch 16/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 88s 1s/step - accuracy: 0.9298 - loss: 0.1871 - val_accuracy: 0.7040 - val_loss: 1.0892 Epoch 17/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 88s 1s/step - accuracy: 0.9587 - loss: 0.1045 - val_accuracy: 0.7020 - val_loss: 1.1347 Epoch 18/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 88s 1s/step - accuracy: 0.9594 - loss: 0.1153 - val_accuracy: 0.7190 - val_loss: 1.1737 Epoch 19/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 89s 1s/step - accuracy: 0.9693 - loss: 0.0828 - val_accuracy: 0.7250 - val_loss: 1.1594 Epoch 20/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 89s 1s/step - accuracy: 0.9755 - loss: 0.0761 - val_accuracy: 0.7150 - val_loss: 1.3420 Epoch 21/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 87s 1s/step - accuracy: 0.9805 - loss: 0.0535 - val_accuracy: 0.6930 - val_loss: 1.5529 Epoch 22/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 91s 1s/step - accuracy: 0.9736 - loss: 0.0655 - val_accuracy: 0.6870 - val_loss: 1.6293 Epoch 23/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 88s 1s/step - accuracy: 0.9831 - loss: 0.0644 - val_accuracy: 0.7050 - val_loss: 1.6971 Epoch 24/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 89s 1s/step - accuracy: 0.9885 - loss: 0.0367 - val_accuracy: 0.7090 - val_loss: 1.5093 Epoch 25/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 88s 1s/step - accuracy: 0.9813 - loss: 0.0432 - val_accuracy: 0.7220 - val_loss: 1.8559 Epoch 26/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 88s 1s/step - accuracy: 0.9752 - loss: 0.0634 - val_accuracy: 0.7200 - val_loss: 1.7970 Epoch 27/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 88s 1s/step - accuracy: 0.9889 - loss: 0.0319 - val_accuracy: 0.6550 - val_loss: 3.3642 Epoch 28/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 89s 1s/step - accuracy: 0.9872 - loss: 0.0474 - val_accuracy: 0.7060 - val_loss: 1.9469 Epoch 29/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 95s 2s/step - accuracy: 0.9884 - loss: 0.0446 - val_accuracy: 0.7140 - val_loss: 1.7061 Epoch 30/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 98s 2s/step - accuracy: 0.9921 - loss: 0.0288 - val_accuracy: 0.7260 - val_loss: 2.0929
The above code serves as a means through which training of a Convolutional Neural Network, or CNN, model is implemented using a ModelCheckpoint callback, saving the best model based on validation loss during training.
callbacks:
- The ModelCheckpoint callback is defined in a list. It monitors the validation loss (
val_loss) during training and saves the model only when there is an improvement in the validation loss. The model is saved to the specified path"./models/convnet_from_scratch.keras".save_best_only=True: Ensures that the model is saved only when the validation loss improves, preventing overwriting with less optimal models.
model_cnn.fit():
train_dataset: This is the dataset used for training the model.epochs=30: The model will be trained for 30 epochs.validation_data=validation_dataset: This is the dataset that will be used to evaluate the model's performance during training; this provides a means to track the model's generalization ability.
callbacks=callbacks: This defines the callback(s) to be utilized during training; in this instance, the ModelCheckpoint for monitoring validation loss.
Summary:¶
This code trains a CNN model using the provided training and validation datasets for 30 epochs. It saves the model whenever there is an improvement in the validation loss. This helps in retaining the best-performing model throughout the training process.
CNN Model Evaluation¶
accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(accuracy) + 1)
plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
plt.show()
Accuracy¶
test_model = keras.models.load_model("./models/convnet_from_scratch.keras")
test_loss, test_acc = test_model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")
63/63 ━━━━━━━━━━━━━━━━━━━━ 25s 363ms/step - accuracy: 0.7056 - loss: 0.5752 Test accuracy: 0.711
Confusion Matrix¶
# Get predictions from the model
y_true = np.concatenate([y for x, y in test_dataset], axis=0) # True labels
y_pred_probs = test_model.predict(test_dataset) # Predicted probabilities
y_pred = (y_pred_probs > 0.5).astype(int) # Convert probabilities to binary predictions
# Confusion Matrix
conf_matrix = confusion_matrix(y_true, y_pred)
# Plot Confusion Matrix
plt.figure(figsize=(6, 5))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues", xticklabels=["Cat", "Dog"], yticklabels=["Cat", "Dog"])
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title("Confusion Matrix")
plt.show()
63/63 ━━━━━━━━━━━━━━━━━━━━ 23s 350ms/step
Classification Report¶
# Classification Report: Precision, Recall, F1-Score
print("Classification Report:")
print(classification_report(y_true, y_pred, target_names=["Cat", "Dog"]))
Classification Report:
precision recall f1-score support
Cat 0.50 0.55 0.53 1000
Dog 0.50 0.45 0.47 1000
accuracy 0.50 2000
macro avg 0.50 0.50 0.50 2000
weighted avg 0.50 0.50 0.50 2000
Precision-Recall Curve¶
# Precision-Recall Curve
precision, recall, _ = precision_recall_curve(y_true, y_pred_probs)
pr_auc = auc(recall, precision)
# Plot Precision-Recall Curve
plt.figure(figsize=(8, 6))
plt.plot(recall, precision, label=f"PR Curve (AUC = {pr_auc:.3f})", color="blue")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.title("Precision-Recall Curve")
plt.legend()
plt.grid()
plt.show()
- The PR curve is plotted with the precision on the y-axis and recall on the x-axis.
- A relatively low overall AUC of 0.485 implies the model does a poor job balancing the two sides.
- The shape of this curve indicates that precision drops initially when recall increases rapidly and then gradually flattens out, with limited improvement in precision for increased recall.
- Overall the PR curve seems to indicate that the model will probably fail to identify actual positives while keeping false positives as low as possible. Further analysis and model optimization might be required to improve performance.
Exploring Incorrect Prediction¶
# Get true labels and predicted probabilities
y_true = np.concatenate([y for x, y in test_dataset], axis=0)
y_pred_probs = test_model.predict(test_dataset)
y_pred = (y_pred_probs > 0.5).astype(int)
# Identify misclassified examples
misclassified_indices = np.where(y_true != y_pred.ravel())[0]
# Load test images
test_images = np.concatenate([x.numpy() for x, y in test_dataset], axis=0)
# Display some misclassified examples
num_examples = min(5, len(misclassified_indices)) # Show up to 5 examples
plt.figure(figsize=(15, 5))
for i, idx in enumerate(misclassified_indices[:num_examples]):
plt.subplot(1, num_examples, i + 1)
plt.imshow(test_images[idx].astype("uint8")) # Show the misclassified image
true_label = "Dog" if y_true[idx] == 1 else "Cat"
predicted_label = "Dog" if y_pred[idx] == 1 else "Cat"
plt.title(f"True: {true_label}\nPred: {predicted_label}", color="red")
plt.axis("off")
plt.tight_layout()
plt.show()
63/63 ━━━━━━━━━━━━━━━━━━━━ 22s 348ms/step
Data Augmentation in CNN¶
data_augmentation = keras.Sequential(
[
layers.RandomFlip("horizontal"),
layers.RandomRotation(0.1),
layers.RandomZoom(0.2),
]
)
plt.figure(figsize=(10, 10))
for images, _ in train_dataset.take(1):
for i in range(9):
augmented_images = data_augmentation(images)
ax = plt.subplot(3, 3, i + 1)
plt.imshow(augmented_images[0].numpy().astype("uint8"))
plt.axis("off")
The above code performs data augmentation on a batch of images using the Sequential model from Keras. The augmentation includes three different transformations: random horizontal flipping, random rotation up to 0.1 radians, and random zoom up to 20%. Then, it displays the augmented images in a 3x3 grid. In each of the 9 subplots, one augmented image is shown, applying the mentioned transformations to the original batch. The sample images are provided from the train_dataset, and the resultant plot helps visualize how the data augmentation will introduce variety in the input images, hence making the model more robust because it is exposed to different variations during training.
Developing and Putting Together a CNN Model for Data-Aided Binary Classification¶
inputs = keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs)
x = layers.Rescaling(1./255)(x)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(loss="binary_crossentropy",
optimizer="rmsprop",
metrics=["accuracy"])
model.summary()
Model: "functional_2"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ input_layer_2 (InputLayer) │ (None, 180, 180, 3) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ sequential (Sequential) │ (None, 180, 180, 3) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ rescaling_1 (Rescaling) │ (None, 180, 180, 3) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_5 (Conv2D) │ (None, 178, 178, 32) │ 896 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_4 (MaxPooling2D) │ (None, 89, 89, 32) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_6 (Conv2D) │ (None, 87, 87, 64) │ 18,496 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_5 (MaxPooling2D) │ (None, 43, 43, 64) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_7 (Conv2D) │ (None, 41, 41, 128) │ 73,856 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_6 (MaxPooling2D) │ (None, 20, 20, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_8 (Conv2D) │ (None, 18, 18, 256) │ 295,168 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_7 (MaxPooling2D) │ (None, 9, 9, 256) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_9 (Conv2D) │ (None, 7, 7, 256) │ 590,080 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten_1 (Flatten) │ (None, 12544) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (Dropout) │ (None, 12544) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 1) │ 12,545 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 991,041 (3.78 MB)
Trainable params: 991,041 (3.78 MB)
Non-trainable params: 0 (0.00 B)
callbacks = [
keras.callbacks.ModelCheckpoint(
filepath="./models/convnet_from_scratch_with_augmentation.keras",
save_best_only=True,
monitor="val_loss")
]
history = model.fit(
train_dataset,
epochs=50,
validation_data=validation_dataset,
callbacks=callbacks)
Epoch 1/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 123s 2s/step - accuracy: 0.5081 - loss: 0.7962 - val_accuracy: 0.5050 - val_loss: 0.6926 Epoch 2/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 122s 2s/step - accuracy: 0.5197 - loss: 0.6939 - val_accuracy: 0.5090 - val_loss: 0.6905 Epoch 3/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 103s 2s/step - accuracy: 0.5436 - loss: 0.6903 - val_accuracy: 0.5230 - val_loss: 0.6961 Epoch 4/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 103s 2s/step - accuracy: 0.5852 - loss: 0.6752 - val_accuracy: 0.5570 - val_loss: 0.7561 Epoch 5/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 95s 2s/step - accuracy: 0.6128 - loss: 0.6443 - val_accuracy: 0.5830 - val_loss: 0.8384 Epoch 6/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 99s 2s/step - accuracy: 0.6266 - loss: 0.6671 - val_accuracy: 0.6270 - val_loss: 0.6768 Epoch 7/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 95s 2s/step - accuracy: 0.6455 - loss: 0.6269 - val_accuracy: 0.6350 - val_loss: 0.6495 Epoch 8/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 95s 2s/step - accuracy: 0.6717 - loss: 0.6047 - val_accuracy: 0.6100 - val_loss: 0.6655 Epoch 9/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 98s 2s/step - accuracy: 0.6834 - loss: 0.6052 - val_accuracy: 0.6530 - val_loss: 0.5998 Epoch 10/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 94s 1s/step - accuracy: 0.6767 - loss: 0.5817 - val_accuracy: 0.6710 - val_loss: 0.5790 Epoch 11/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 102s 2s/step - accuracy: 0.7183 - loss: 0.5657 - val_accuracy: 0.5930 - val_loss: 0.9899 Epoch 12/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 105s 2s/step - accuracy: 0.6951 - loss: 0.5969 - val_accuracy: 0.7010 - val_loss: 0.5748 Epoch 13/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 96s 2s/step - accuracy: 0.6958 - loss: 0.5637 - val_accuracy: 0.6960 - val_loss: 0.5994 Epoch 14/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 95s 2s/step - accuracy: 0.7359 - loss: 0.5343 - val_accuracy: 0.6980 - val_loss: 0.5652 Epoch 15/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 89s 1s/step - accuracy: 0.7300 - loss: 0.5175 - val_accuracy: 0.7370 - val_loss: 0.5486 Epoch 16/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 88s 1s/step - accuracy: 0.7315 - loss: 0.5267 - val_accuracy: 0.7390 - val_loss: 0.5423 Epoch 17/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 87s 1s/step - accuracy: 0.7323 - loss: 0.5234 - val_accuracy: 0.7490 - val_loss: 0.5344 Epoch 18/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 86s 1s/step - accuracy: 0.7576 - loss: 0.5000 - val_accuracy: 0.7580 - val_loss: 0.5024 Epoch 19/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 86s 1s/step - accuracy: 0.7532 - loss: 0.4982 - val_accuracy: 0.7450 - val_loss: 0.5648 Epoch 20/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 87s 1s/step - accuracy: 0.7522 - loss: 0.5094 - val_accuracy: 0.7380 - val_loss: 0.5261 Epoch 21/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 91s 1s/step - accuracy: 0.7765 - loss: 0.4674 - val_accuracy: 0.6340 - val_loss: 0.7106 Epoch 22/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 93s 1s/step - accuracy: 0.7563 - loss: 0.4931 - val_accuracy: 0.7580 - val_loss: 0.5105 Epoch 23/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 92s 1s/step - accuracy: 0.7905 - loss: 0.4424 - val_accuracy: 0.7640 - val_loss: 0.5357 Epoch 24/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 89s 1s/step - accuracy: 0.7507 - loss: 0.4969 - val_accuracy: 0.5990 - val_loss: 0.7910 Epoch 25/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 92s 1s/step - accuracy: 0.7830 - loss: 0.4568 - val_accuracy: 0.7720 - val_loss: 0.4742 Epoch 26/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 91s 1s/step - accuracy: 0.7898 - loss: 0.4245 - val_accuracy: 0.7610 - val_loss: 0.5972 Epoch 27/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 90s 1s/step - accuracy: 0.7817 - loss: 0.4537 - val_accuracy: 0.7720 - val_loss: 0.5272 Epoch 28/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 91s 1s/step - accuracy: 0.8066 - loss: 0.4164 - val_accuracy: 0.7590 - val_loss: 0.4957 Epoch 29/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 91s 1s/step - accuracy: 0.8027 - loss: 0.4174 - val_accuracy: 0.7930 - val_loss: 0.4516 Epoch 30/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 88s 1s/step - accuracy: 0.8214 - loss: 0.3953 - val_accuracy: 0.7970 - val_loss: 0.4533 Epoch 31/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 96s 2s/step - accuracy: 0.8323 - loss: 0.3865 - val_accuracy: 0.7880 - val_loss: 0.4653 Epoch 32/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 86s 1s/step - accuracy: 0.8143 - loss: 0.3944 - val_accuracy: 0.7550 - val_loss: 0.5166 Epoch 33/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 94s 1s/step - accuracy: 0.8201 - loss: 0.3785 - val_accuracy: 0.7590 - val_loss: 0.5832 Epoch 34/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 93s 1s/step - accuracy: 0.8279 - loss: 0.3972 - val_accuracy: 0.7940 - val_loss: 0.4648 Epoch 35/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 114s 2s/step - accuracy: 0.8135 - loss: 0.3956 - val_accuracy: 0.7850 - val_loss: 0.4957 Epoch 36/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 115s 2s/step - accuracy: 0.8392 - loss: 0.3576 - val_accuracy: 0.7900 - val_loss: 0.5625 Epoch 37/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 113s 2s/step - accuracy: 0.8529 - loss: 0.3463 - val_accuracy: 0.7780 - val_loss: 0.4795 Epoch 38/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 115s 2s/step - accuracy: 0.8489 - loss: 0.3364 - val_accuracy: 0.8090 - val_loss: 0.4368 Epoch 39/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 113s 2s/step - accuracy: 0.8519 - loss: 0.3417 - val_accuracy: 0.7770 - val_loss: 0.5222 Epoch 40/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 114s 2s/step - accuracy: 0.8343 - loss: 0.3538 - val_accuracy: 0.8320 - val_loss: 0.4091 Epoch 41/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 120s 2s/step - accuracy: 0.8541 - loss: 0.3293 - val_accuracy: 0.8190 - val_loss: 0.4391 Epoch 42/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 120s 2s/step - accuracy: 0.8572 - loss: 0.3058 - val_accuracy: 0.8300 - val_loss: 0.4518 Epoch 43/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 114s 2s/step - accuracy: 0.8781 - loss: 0.3130 - val_accuracy: 0.8300 - val_loss: 0.4130 Epoch 44/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 114s 2s/step - accuracy: 0.8841 - loss: 0.2799 - val_accuracy: 0.7930 - val_loss: 0.4777 Epoch 45/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 114s 2s/step - accuracy: 0.8783 - loss: 0.2979 - val_accuracy: 0.8070 - val_loss: 0.6286 Epoch 46/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 112s 2s/step - accuracy: 0.8631 - loss: 0.3098 - val_accuracy: 0.8220 - val_loss: 0.4863 Epoch 47/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 116s 2s/step - accuracy: 0.8824 - loss: 0.3003 - val_accuracy: 0.8510 - val_loss: 0.4163 Epoch 48/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 119s 2s/step - accuracy: 0.8868 - loss: 0.2798 - val_accuracy: 0.8310 - val_loss: 0.4517 Epoch 49/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 98s 2s/step - accuracy: 0.8770 - loss: 0.2970 - val_accuracy: 0.7990 - val_loss: 0.5007 Epoch 50/50 63/63 ━━━━━━━━━━━━━━━━━━━━ 87s 1s/step - accuracy: 0.8783 - loss: 0.2742 - val_accuracy: 0.8180 - val_loss: 0.5921
Accuracy¶
accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(accuracy) + 1)
plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
plt.show()
Confusion Matrix¶
# Get predictions from the model
y_true = np.concatenate([y for x, y in test_dataset], axis=0) # True labels
y_pred_probs = test_model.predict(test_dataset) # Predicted probabilities
y_pred = (y_pred_probs > 0.5).astype(int) # Convert probabilities to binary predictions
# Confusion Matrix
conf_matrix = confusion_matrix(y_true, y_pred)
# Plot Confusion Matrix
plt.figure(figsize=(6, 5))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues", xticklabels=["Cat", "Dog"], yticklabels=["Cat", "Dog"])
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title("Confusion Matrix")
plt.show()
63/63 ━━━━━━━━━━━━━━━━━━━━ 19s 293ms/step
Classification Report¶
# Classification Report: Precision, Recall, F1-Score
print("Classification Report:")
print(classification_report(y_true, y_pred, target_names=["Cat", "Dog"]))
Classification Report:
precision recall f1-score support
Cat 0.48 0.54 0.51 1000
Dog 0.48 0.43 0.45 1000
accuracy 0.48 2000
macro avg 0.48 0.48 0.48 2000
weighted avg 0.48 0.48 0.48 2000
Precision-Recall Curve¶
# Precision-Recall Curve
precision, recall, _ = precision_recall_curve(y_true, y_pred_probs)
pr_auc = auc(recall, precision)
# Plot Precision-Recall Curve
plt.figure(figsize=(8, 6))
plt.plot(recall, precision, label=f"PR Curve (AUC = {pr_auc:.3f})", color="blue")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.title("Precision-Recall Curve")
plt.legend()
plt.grid()
plt.show()
Exploring Incorrect Prediction¶
# Get true labels and predicted probabilities
y_true = np.concatenate([y for x, y in test_dataset], axis=0)
y_pred_probs = test_model.predict(test_dataset)
y_pred = (y_pred_probs > 0.5).astype(int)
# Identify misclassified examples
misclassified_indices = np.where(y_true != y_pred.ravel())[0]
# Load test images
test_images = np.concatenate([x.numpy() for x, y in test_dataset], axis=0)
# Display some misclassified examples
num_examples = min(5, len(misclassified_indices)) # Show up to 5 examples
plt.figure(figsize=(15, 5))
for i, idx in enumerate(misclassified_indices[:num_examples]):
plt.subplot(1, num_examples, i + 1)
plt.imshow(test_images[idx].astype("uint8")) # Show the misclassified image
true_label = "Dog" if y_true[idx] == 1 else "Cat"
predicted_label = "Dog" if y_pred[idx] == 1 else "Cat"
plt.title(f"True: {true_label}\nPred: {predicted_label}", color="red")
plt.axis("off")
plt.tight_layout()
plt.show()
63/63 ━━━━━━━━━━━━━━━━━━━━ 18s 286ms/step
VGG16¶
Defining a model for VGG16¶
conv_base = keras.applications.vgg16.VGG16(
weights="imagenet",
include_top=False,
input_shape=(180, 180, 3))
def get_features_and_labels(dataset):
all_features = []
all_labels = []
for images, labels in dataset:
preprocessed_images = keras.applications.vgg16.preprocess_input(images)
features = conv_base.predict(preprocessed_images)
all_features.append(features)
all_labels.append(labels)
return np.concatenate(all_features), np.concatenate(all_labels)
train_features, train_labels = get_features_and_labels(train_dataset)
val_features, val_labels = get_features_and_labels(validation_dataset)
test_features, test_labels = get_features_and_labels(test_dataset)
1/1 ━━━━━━━━━━━━━━━━━━━━ 7s 7s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 3s 3s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 2s 2s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 1025s 1025s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 7s 7s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 7s 7s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 7s 7s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 3s 3s/step
train_features.shape
(2000, 5, 5, 512)
inputs = keras.Input(shape=(5, 5, 512))
x = layers.Flatten()(inputs)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model_vgg16 = keras.Model(inputs, outputs)
model_vgg16.summary()
Model: "functional_3"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ input_layer_4 (InputLayer) │ (None, 5, 5, 512) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten_2 (Flatten) │ (None, 12800) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 256) │ 3,277,056 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout_1 (Dropout) │ (None, 256) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 1) │ 257 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 3,277,313 (12.50 MB)
Trainable params: 3,277,313 (12.50 MB)
Non-trainable params: 0 (0.00 B)
model_vgg16.compile(loss="binary_crossentropy",
optimizer="rmsprop",
metrics=["accuracy"])
callbacks = [
keras.callbacks.ModelCheckpoint(
filepath="./models/feature_extraction.keras",
save_best_only=True,
monitor="val_loss")
]
history = model_vgg16.fit(
train_features, train_labels,
epochs=20,
validation_data=(val_features, val_labels),
callbacks=callbacks)
Epoch 1/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 7s 58ms/step - accuracy: 0.8713 - loss: 42.7003 - val_accuracy: 0.9390 - val_loss: 10.0618 Epoch 2/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 35ms/step - accuracy: 0.9755 - loss: 3.6969 - val_accuracy: 0.9720 - val_loss: 3.4074 Epoch 3/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 30ms/step - accuracy: 0.9853 - loss: 1.7286 - val_accuracy: 0.9580 - val_loss: 9.4572 Epoch 4/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 51ms/step - accuracy: 0.9908 - loss: 0.9290 - val_accuracy: 0.9740 - val_loss: 4.3173 Epoch 5/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 33ms/step - accuracy: 0.9936 - loss: 0.8277 - val_accuracy: 0.9770 - val_loss: 3.6608 Epoch 6/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 28ms/step - accuracy: 0.9962 - loss: 0.3710 - val_accuracy: 0.9700 - val_loss: 8.0627 Epoch 7/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 28ms/step - accuracy: 0.9923 - loss: 1.0299 - val_accuracy: 0.9670 - val_loss: 8.2059 Epoch 8/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 31ms/step - accuracy: 0.9978 - loss: 0.1127 - val_accuracy: 0.9750 - val_loss: 4.2843 Epoch 9/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 31ms/step - accuracy: 0.9959 - loss: 0.3736 - val_accuracy: 0.9850 - val_loss: 3.9032 Epoch 10/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 28ms/step - accuracy: 0.9996 - loss: 0.0266 - val_accuracy: 0.9740 - val_loss: 5.9691 Epoch 11/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 28ms/step - accuracy: 0.9968 - loss: 0.1412 - val_accuracy: 0.9810 - val_loss: 4.1631 Epoch 12/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 30ms/step - accuracy: 0.9986 - loss: 0.0716 - val_accuracy: 0.9790 - val_loss: 5.1360 Epoch 13/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 30ms/step - accuracy: 0.9973 - loss: 0.5120 - val_accuracy: 0.9810 - val_loss: 4.2839 Epoch 14/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 30ms/step - accuracy: 0.9996 - loss: 0.0152 - val_accuracy: 0.9820 - val_loss: 4.9624 Epoch 15/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 28ms/step - accuracy: 0.9953 - loss: 0.3655 - val_accuracy: 0.9710 - val_loss: 7.4543 Epoch 16/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 31ms/step - accuracy: 0.9988 - loss: 0.1021 - val_accuracy: 0.9800 - val_loss: 5.2258 Epoch 17/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 29ms/step - accuracy: 0.9997 - loss: 0.0190 - val_accuracy: 0.9780 - val_loss: 6.6710 Epoch 18/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 28ms/step - accuracy: 0.9969 - loss: 0.1831 - val_accuracy: 0.9830 - val_loss: 4.2689 Epoch 19/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 30ms/step - accuracy: 1.0000 - loss: 7.2520e-04 - val_accuracy: 0.9660 - val_loss: 6.9887 Epoch 20/20 63/63 ━━━━━━━━━━━━━━━━━━━━ 2s 28ms/step - accuracy: 0.9960 - loss: 0.5647 - val_accuracy: 0.9770 - val_loss: 3.7790
Evaluating VGG16 model¶
acc = history.history["accuracy"]
val_acc = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, "bo", label="Training accuracy")
plt.plot(epochs, val_acc, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
plt.show()
Accuracy¶
test_model = keras.models.load_model(
"./models/feature_extraction.keras")
test_loss, test_acc = test_model.evaluate(x=test_features, y=test_labels)
print(f"Test accuracy: {test_acc:.3f}")
63/63 ━━━━━━━━━━━━━━━━━━━━ 9s 22ms/step - accuracy: 0.9580 - loss: 5.3827 Test accuracy: 0.964
Confusion Metric¶
# Evaluate model predictions
predictions = test_model.predict(test_features) # Get predicted probabilities
predicted_classes = (predictions > 0.5).astype("int32") # Convert probabilities to binary predictions
# 1. Confusion Matrix
conf_matrix = confusion_matrix(test_labels, predicted_classes)
# Plot confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues", xticklabels=["Cat", "Dog"], yticklabels=["Cat", "Dog"])
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
63/63 ━━━━━━━━━━━━━━━━━━━━ 3s 33ms/step
Precision, Recall, and F1-Score¶
print("Classification Report:")
print(classification_report(test_labels, predicted_classes, target_names=["Cat", "Dog"]))
Classification Report:
precision recall f1-score support
Cat 0.95 0.98 0.96 1000
Dog 0.98 0.95 0.96 1000
accuracy 0.96 2000
macro avg 0.96 0.96 0.96 2000
weighted avg 0.96 0.96 0.96 2000
Precision-Recall Curve¶
precision, recall, thresholds = precision_recall_curve(test_labels, predictions)
# Plot precision-recall curve
plt.figure(figsize=(8, 6))
plt.plot(recall, precision, marker=".", label="Precision-Recall Curve")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.title("Precision-Recall Curve")
plt.legend()
plt.show()
Exploring Incorrect Prediction¶
# Display a few misclassified examples
num_examples = 6
plt.figure(figsize=(15, 5))
for i, idx in enumerate(misclassified_indices[:num_examples]):
plt.subplot(1, num_examples, i + 1)
# Use the original images for visualization
img = test_images[idx]
plt.imshow(img.astype("uint8"))
plt.title(f"True: {test_labels[idx]}, Pred: {predicted_classes[idx][0]}")
plt.axis("off")
plt.show()
Fine-tuning a pretrained model¶
conv_base = keras.applications.vgg16.VGG16(
weights="imagenet",
include_top=False)
conv_base.summary()
Model: "vgg16"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ input_layer_5 (InputLayer) │ (None, None, None, 3) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block1_conv1 (Conv2D) │ (None, None, None, 64) │ 1,792 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block1_conv2 (Conv2D) │ (None, None, None, 64) │ 36,928 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block1_pool (MaxPooling2D) │ (None, None, None, 64) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block2_conv1 (Conv2D) │ (None, None, None, │ 73,856 │ │ │ 128) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block2_conv2 (Conv2D) │ (None, None, None, │ 147,584 │ │ │ 128) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block2_pool (MaxPooling2D) │ (None, None, None, │ 0 │ │ │ 128) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block3_conv1 (Conv2D) │ (None, None, None, │ 295,168 │ │ │ 256) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block3_conv2 (Conv2D) │ (None, None, None, │ 590,080 │ │ │ 256) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block3_conv3 (Conv2D) │ (None, None, None, │ 590,080 │ │ │ 256) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block3_pool (MaxPooling2D) │ (None, None, None, │ 0 │ │ │ 256) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block4_conv1 (Conv2D) │ (None, None, None, │ 1,180,160 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block4_conv2 (Conv2D) │ (None, None, None, │ 2,359,808 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block4_conv3 (Conv2D) │ (None, None, None, │ 2,359,808 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block4_pool (MaxPooling2D) │ (None, None, None, │ 0 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block5_conv1 (Conv2D) │ (None, None, None, │ 2,359,808 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block5_conv2 (Conv2D) │ (None, None, None, │ 2,359,808 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block5_conv3 (Conv2D) │ (None, None, None, │ 2,359,808 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block5_pool (MaxPooling2D) │ (None, None, None, │ 0 │ │ │ 512) │ │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 14,714,688 (56.13 MB)
Trainable params: 14,714,688 (56.13 MB)
Non-trainable params: 0 (0.00 B)
conv_base.trainable = False
conv_base.summary()
Model: "vgg16"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ input_layer_5 (InputLayer) │ (None, None, None, 3) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block1_conv1 (Conv2D) │ (None, None, None, 64) │ 1,792 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block1_conv2 (Conv2D) │ (None, None, None, 64) │ 36,928 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block1_pool (MaxPooling2D) │ (None, None, None, 64) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block2_conv1 (Conv2D) │ (None, None, None, │ 73,856 │ │ │ 128) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block2_conv2 (Conv2D) │ (None, None, None, │ 147,584 │ │ │ 128) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block2_pool (MaxPooling2D) │ (None, None, None, │ 0 │ │ │ 128) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block3_conv1 (Conv2D) │ (None, None, None, │ 295,168 │ │ │ 256) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block3_conv2 (Conv2D) │ (None, None, None, │ 590,080 │ │ │ 256) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block3_conv3 (Conv2D) │ (None, None, None, │ 590,080 │ │ │ 256) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block3_pool (MaxPooling2D) │ (None, None, None, │ 0 │ │ │ 256) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block4_conv1 (Conv2D) │ (None, None, None, │ 1,180,160 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block4_conv2 (Conv2D) │ (None, None, None, │ 2,359,808 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block4_conv3 (Conv2D) │ (None, None, None, │ 2,359,808 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block4_pool (MaxPooling2D) │ (None, None, None, │ 0 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block5_conv1 (Conv2D) │ (None, None, None, │ 2,359,808 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block5_conv2 (Conv2D) │ (None, None, None, │ 2,359,808 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block5_conv3 (Conv2D) │ (None, None, None, │ 2,359,808 │ │ │ 512) │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ block5_pool (MaxPooling2D) │ (None, None, None, │ 0 │ │ │ 512) │ │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 14,714,688 (56.13 MB)
Trainable params: 0 (0.00 B)
Non-trainable params: 14,714,688 (56.13 MB)
data_augmentation = keras.Sequential(
[
layers.RandomFlip("horizontal"),
layers.RandomRotation(0.1),
layers.RandomZoom(0.2),
]
)
inputs = keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs)
x = keras.applications.vgg16.preprocess_input(x)
x = conv_base(x)
x = layers.Flatten()(x)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.summary()
Model: "functional_5"
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ Connected to ┃ ┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩ │ input_layer_6 │ (None, 180, 180, │ 0 │ - │ │ (InputLayer) │ 3) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ sequential_1 │ (None, 180, 180, │ 0 │ input_layer_6[0]… │ │ (Sequential) │ 3) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ get_item (GetItem) │ (None, 180, 180) │ 0 │ sequential_1[0][… │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ get_item_1 │ (None, 180, 180) │ 0 │ sequential_1[0][… │ │ (GetItem) │ │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ get_item_2 │ (None, 180, 180) │ 0 │ sequential_1[0][… │ │ (GetItem) │ │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ stack (Stack) │ (None, 180, 180, │ 0 │ get_item[0][0], │ │ │ 3) │ │ get_item_1[0][0], │ │ │ │ │ get_item_2[0][0] │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ add (Add) │ (None, 180, 180, │ 0 │ stack[0][0] │ │ │ 3) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ vgg16 (Functional) │ (None, 5, 5, 512) │ 14,714,688 │ add[0][0] │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ flatten_3 (Flatten) │ (None, 12800) │ 0 │ vgg16[0][0] │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ dense_4 (Dense) │ (None, 256) │ 3,277,056 │ flatten_3[0][0] │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ dropout_2 (Dropout) │ (None, 256) │ 0 │ dense_4[0][0] │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ dense_5 (Dense) │ (None, 1) │ 257 │ dropout_2[0][0] │ └─────────────────────┴───────────────────┴────────────┴───────────────────┘
Total params: 17,992,001 (68.63 MB)
Trainable params: 3,277,313 (12.50 MB)
Non-trainable params: 14,714,688 (56.13 MB)
Freezing all layers until the fourth from the last¶
conv_base.trainable = True
for layer in conv_base.layers[:-4]:
layer.trainable = False
Fine-tuning the model¶
model.compile(loss="binary_crossentropy",
optimizer=keras.optimizers.RMSprop(learning_rate=1e-5),
metrics=["accuracy"])
callbacks = [
keras.callbacks.ModelCheckpoint(
filepath="./models/fine_tuning.keras",
save_best_only=True,
monitor="val_loss")
]
history = model.fit(
train_dataset,
epochs=30,
validation_data=validation_dataset,
callbacks=callbacks)
Epoch 1/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 653s 10s/step - accuracy: 0.7194 - loss: 5.0638 - val_accuracy: 0.9330 - val_loss: 0.6285 Epoch 2/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 588s 9s/step - accuracy: 0.8872 - loss: 0.9489 - val_accuracy: 0.9560 - val_loss: 0.4095 Epoch 3/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 571s 9s/step - accuracy: 0.9114 - loss: 0.7454 - val_accuracy: 0.9620 - val_loss: 0.2809 Epoch 4/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 600s 10s/step - accuracy: 0.9377 - loss: 0.3708 - val_accuracy: 0.9650 - val_loss: 0.2611 Epoch 5/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 563s 9s/step - accuracy: 0.9277 - loss: 0.3501 - val_accuracy: 0.9700 - val_loss: 0.2096 Epoch 6/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 576s 9s/step - accuracy: 0.9491 - loss: 0.2128 - val_accuracy: 0.9720 - val_loss: 0.1949 Epoch 7/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 579s 9s/step - accuracy: 0.9535 - loss: 0.1862 - val_accuracy: 0.9720 - val_loss: 0.1699 Epoch 8/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 585s 9s/step - accuracy: 0.9555 - loss: 0.1985 - val_accuracy: 0.9730 - val_loss: 0.1437 Epoch 9/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 583s 9s/step - accuracy: 0.9636 - loss: 0.1425 - val_accuracy: 0.9720 - val_loss: 0.1585 Epoch 10/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 581s 9s/step - accuracy: 0.9652 - loss: 0.1199 - val_accuracy: 0.9760 - val_loss: 0.1619 Epoch 11/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 585s 9s/step - accuracy: 0.9691 - loss: 0.1197 - val_accuracy: 0.9770 - val_loss: 0.1601 Epoch 12/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 602s 10s/step - accuracy: 0.9729 - loss: 0.0829 - val_accuracy: 0.9730 - val_loss: 0.1691 Epoch 13/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 584s 9s/step - accuracy: 0.9737 - loss: 0.1002 - val_accuracy: 0.9740 - val_loss: 0.1698 Epoch 14/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 569s 9s/step - accuracy: 0.9783 - loss: 0.0638 - val_accuracy: 0.9730 - val_loss: 0.2169 Epoch 15/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 568s 9s/step - accuracy: 0.9655 - loss: 0.1211 - val_accuracy: 0.9760 - val_loss: 0.1671 Epoch 16/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 618s 10s/step - accuracy: 0.9909 - loss: 0.0324 - val_accuracy: 0.9730 - val_loss: 0.1713 Epoch 17/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 581s 9s/step - accuracy: 0.9817 - loss: 0.0704 - val_accuracy: 0.9740 - val_loss: 0.1543 Epoch 18/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 574s 9s/step - accuracy: 0.9781 - loss: 0.0924 - val_accuracy: 0.9750 - val_loss: 0.1414 Epoch 19/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 572s 9s/step - accuracy: 0.9891 - loss: 0.0537 - val_accuracy: 0.9790 - val_loss: 0.1643 Epoch 20/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 574s 9s/step - accuracy: 0.9837 - loss: 0.0420 - val_accuracy: 0.9770 - val_loss: 0.1640 Epoch 21/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 575s 9s/step - accuracy: 0.9874 - loss: 0.0408 - val_accuracy: 0.9790 - val_loss: 0.1368 Epoch 22/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 578s 9s/step - accuracy: 0.9852 - loss: 0.0439 - val_accuracy: 0.9790 - val_loss: 0.1319 Epoch 23/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 575s 9s/step - accuracy: 0.9904 - loss: 0.0279 - val_accuracy: 0.9770 - val_loss: 0.1104 Epoch 24/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 574s 9s/step - accuracy: 0.9861 - loss: 0.0469 - val_accuracy: 0.9820 - val_loss: 0.1243 Epoch 25/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 574s 9s/step - accuracy: 0.9921 - loss: 0.0426 - val_accuracy: 0.9830 - val_loss: 0.1242 Epoch 26/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 574s 9s/step - accuracy: 0.9921 - loss: 0.0298 - val_accuracy: 0.9780 - val_loss: 0.1510 Epoch 27/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 576s 9s/step - accuracy: 0.9939 - loss: 0.0246 - val_accuracy: 0.9750 - val_loss: 0.1468 Epoch 28/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 574s 9s/step - accuracy: 0.9912 - loss: 0.0480 - val_accuracy: 0.9790 - val_loss: 0.1340 Epoch 29/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 574s 9s/step - accuracy: 0.9924 - loss: 0.0316 - val_accuracy: 0.9820 - val_loss: 0.1269 Epoch 30/30 63/63 ━━━━━━━━━━━━━━━━━━━━ 576s 9s/step - accuracy: 0.9905 - loss: 0.0247 - val_accuracy: 0.9790 - val_loss: 0.1367
acc = history.history["accuracy"]
val_acc = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, "bo", label="Training accuracy")
plt.plot(epochs, val_acc, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
plt.show()
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay, precision_recall_fscore_support, classification_report
from sklearn.metrics import PrecisionRecallDisplay
import numpy as np
# Load the model
model = keras.models.load_model("./models/fine_tuning.keras")
# Evaluate accuracy on the test set
test_loss, test_acc = model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")
# Get true labels and predicted probabilities
true_labels = np.concatenate([y for x, y in test_dataset], axis=0)
pred_probs = model.predict(test_dataset)
predictions = (pred_probs > 0.5).astype("int32")
# 1. Confusion Matrix
cm = confusion_matrix(y_true=true_labels, y_pred=predictions)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=["Cat", "Dog"])
disp.plot(cmap="Blues")
plt.title("Confusion Matrix")
plt.show()
# 2. Precision, Recall, and F1-Score
precision, recall, f1, _ = precision_recall_fscore_support(true_labels, predictions, average="binary")
print(f"Precision: {precision:.3f}, Recall: {recall:.3f}, F1-Score: {f1:.3f}")
# Classification Report
print("\nClassification Report:")
print(classification_report(true_labels, predictions, target_names=["Cat", "Dog"]))
# 3. Precision-Recall Curve
disp = PrecisionRecallDisplay.from_predictions(true_labels, pred_probs, name="Model Precision-Recall")
plt.title("Precision-Recall Curve")
plt.show()
# 4. Misclassified Examples
misclassified_indices = np.where(true_labels != predictions.flatten())[0]
print(f"Number of misclassified examples: {len(misclassified_indices)}")
# Explore specific misclassified examples
import matplotlib.pyplot as plt
for idx in misclassified_indices[:5]: # Show first 5 misclassified examples
image = test_images[idx]
true_label = "Dog" if test_labels[idx] == 1 else "Cat"
predicted_label = "Dog" if predictions[idx] == 1 else "Cat"
plt.imshow(image.astype("uint8"))
plt.title(f"True: {true_label}, Predicted: {predicted_label}")
plt.axis("off")
plt.show()
63/63 ━━━━━━━━━━━━━━━━━━━━ 335s 5s/step - accuracy: 0.9711 - loss: 0.1979 Test accuracy: 0.973 63/63 ━━━━━━━━━━━━━━━━━━━━ 350s 6s/step
Precision: 0.527, Recall: 0.521, F1-Score: 0.524
Classification Report:
precision recall f1-score support
Cat 0.53 0.53 0.53 1000
Dog 0.53 0.52 0.52 1000
accuracy 0.53 2000
macro avg 0.53 0.53 0.53 2000
weighted avg 0.53 0.53 0.53 2000
Number of misclassified examples: 947
Accuracy¶
model = keras.models.load_model("./models/fine_tuning.keras")
test_loss, test_acc = model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")
63/63 ━━━━━━━━━━━━━━━━━━━━ 333s 5s/step - accuracy: 0.9703 - loss: 0.1887 Test accuracy: 0.973
Confusion Matrix¶
# Get true labels and predicted probabilities
true_labels = np.concatenate([y for x, y in test_dataset], axis=0)
pred_probs = model.predict(test_dataset)
predictions = (pred_probs > 0.5).astype("int32")
63/63 ━━━━━━━━━━━━━━━━━━━━ 361s 6s/step
cm = confusion_matrix(y_true=true_labels, y_pred=predictions)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=["Cat", "Dog"])
disp.plot(cmap="Blues")
plt.title("Confusion Matrix")
plt.show()
Precision, Recall, and F1-Score¶
# 2. Precision, Recall, and F1-Score
precision, recall, f1, _ = precision_recall_fscore_support(true_labels, predictions, average="binary")
print(f"Precision: {precision:.3f}, Recall: {recall:.3f}, F1-Score: {f1:.3f}")
Precision: 0.502, Recall: 0.496, F1-Score: 0.499
Precision-Recall Curve¶
# 4. Precision-Recall Curve
precision_vals, recall_vals, _ = precision_recall_curve(y_true, y_pred)
plt.plot(recall_vals, precision_vals, color='blue')
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.title("Precision-Recall Curve")
plt.show()
# Area under the precision-recall curve
pr_auc = auc(recall_vals, precision_vals)
print(f"Area Under Precision-Recall Curve (PR AUC): {pr_auc:.3f}")
Area Under Precision-Recall Curve (PR AUC): 0.605
Exploring Incorrect Prediction¶
import numpy as np
import matplotlib.pyplot as plt
# List to store misclassified examples
wrong_predictions = []
# Loop through the test dataset
for img_batch, label_batch in test_dataset:
# Get model predictions for the batch
preds = model.predict(img_batch)
# Iterate over each image in the batch
for i in range(len(preds)):
# Convert predictions to binary (0 for Cat, 1 for Dog)
pred_label = 1 if preds[i] > 0.5 else 0
true_label = label_batch[i].numpy() # True label
# Check if prediction is incorrect
if pred_label != true_label:
# Append the misclassified image, its true label, and predicted label
wrong_predictions.append((img_batch[i], true_label, pred_label))
# Displaying a few failed predictions
num_examples_to_show = 5
plt.figure(figsize=(15, 5))
# Loop through the misclassified examples
for i in range(min(num_examples_to_show, len(wrong_predictions))):
img, true_label, pred_label = wrong_predictions[i]
# Plot the image
plt.subplot(1, num_examples_to_show, i + 1)
plt.imshow(img)
plt.axis('off')
plt.title(f"True: {'Cat' if true_label == 0 else 'Dog'}\nPred: {'Cat' if pred_label == 0 else 'Dog'}")
plt.show()
1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 7s 7s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 8s 8s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 9s 9s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 7s 7s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 7s 7s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 7s 7s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step 1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.8093324..253.46295]. Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..246.92595]. Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.90005493..255.0]. Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..255.0]. Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [82.68366..255.0].
Conclusion and Insights on Results¶
- Data Augmentation Benefits:
The application of random flipping, rotation, and zoom significantly increased the dataset's diversity, reducing overfitting risk. Augmented data introduces variability in images, enabling the model to generalize better when encountering unseen data during testing.
- Model Architecture:
The CNN architecture, with its five convolutional layers and associated pooling, progressively extracts hierarchical features from the images. The inclusion of dropout helps mitigate overfitting, ensuring that the model generalizes effectively.
- Performance Trends:
- Initial Epochs: The training and validation accuracy begin relatively low (~50%), which is expected as the model starts learning the patterns in the data. The loss values indicate that the model initially struggles to differentiate between the two classes effectively.
- Mid to Late Epochs: By epoch 15, accuracy increases significantly, with validation accuracy stabilizing around 74–75%, and validation loss decreasing. This indicates the model is learning and generalizing better over time.
- Overfitting Signs: Validation loss begins to fluctuate slightly in later epochs, suggesting potential overfitting. This is somewhat mitigated by the dropout layer and data augmentation.
- Model Performance: The highest observed validation accuracy reaches approximately 76%, which is reasonable for binary classification tasks on augmented image data. Validation loss stabilizes around 0.5 after epoch 18, showing that the model has effectively minimized its prediction error.
- Training Speed: Each epoch takes ~90–120 seconds, which reflects the computational cost of a deep model with data augmentation.
Insight on the Accuracies¶
Training Accuracy:¶
• Increase over the epochs: The training accuracy increases at each epoch, which means it learns the pattern of the training data. It shows very low accuracy at the beginning but much better in later epochs; therefore, the model learns incrementally to identify features and classify images correctly.
- Final Training Accuracy: If the final training accuracy stabilizes in the range of 85–90%, that means that your model has successfully fitted to the training data. This could also point out a possible overfitting issue if there is a significant gap between the training and validation accuracy.
Validation Accuracy:¶
- Initial Performance: It starts off with a validation accuracy of around 50%, which is expected at the beginning of training, since it is not yet tuned. At the beginning of training, over the first few epochs, the model is basically just guessing.
- Stabilizing Trend: From epoch 10–15, the validation accuracy starts to increase smoothly. A ~75–76% validation accuracy means that the model is generalizing well to the unseen data of the validation set.
- Model's Generalization: An accuracy of ~75% is quite decent for a binary classification task on augmented image data. However, there is some room for improvement. If the training accuracy is much higher than the validation accuracy, this could point to overfitting. In this case, the model would be memorizing the training data, rather than learning the general features underlying it.
Training vs. Validation Accuracy:¶
- Moderate Gap: If there is a gap between training and validation accuracy, it could indicate overfitting of the model, especially if the training accuracy is very high compared to the validation accuracy. Overfitting occurs when a model learns to perform extremely well on the training set but performs poorly on new data.
- Regularization Impact: Dropout layers and data augmentation help to mitigate this by introducing randomness during training, which in turn prevents the model from memorizing the training data. This improvement in performance on the validation set suggests that the dropout and augmentation strategies are effective.
Early Epochs (Epoch 1–5):¶
- Low Accuracy and High Loss: For the first few epochs, both training and validation accuracies are comparatively low, usually around 50%. This is expected since the model has just started to learn. The loss is normally high in this phase since the model weights are still being adjusted and convergence of the learning process has not occurred yet. Exploration Phase: In these epochs, the model explores the feature space to comprehend the primary patterns of the images.
Middle Epochs (Epoch 6–15):¶
- Steady Accuracy Improvement: After around 6–15 epochs, the model starts showing consistent improvement in accuracy, with validation accuracy going up to ~70–75%. This is a good indication that the model is effectively learning from the data it has seen so far and improving its predictions.
- Accuracy Plateau: The model's accuracy starts to increase more slowly after approximately epoch 15, which can be interpreted as the model beginning to reach a point where it has learned the main features from the dataset. This behavior is common as the model begins to converge.
Late Epochs (Epoch 16–20):¶
- Possible Overfitting Indicators: By the late epochs, the training accuracy might reach a high value (e.g., ~85–90%) while the validation accuracy might plateau around ~75–76%. This gap could indicate that the model is overfitting to the training data, and further regularization methods (e.g., increased dropout or L2 regularization) might be necessary to help the model generalize better.
- Plateau of Validation Accuracy: If the validation accuracy is almost flat while the training accuracy is increasing, this might mean that the model has learned most of the useful patterns from the training data but struggles to generalize on the validation data due to complexity or noise.
Key Takeaways from the Accuracy Analysis:¶
- Generalization: The accuracy of 75–76% is a good number for a binary classification model, which means that the model learns useful features from augmented data.
- Overfitting Risk: The gap between training and validation accuracy at the end of the epochs suggests some overfitting. Techniques like early stopping, further regularization, or using a more complex architecture could potentially help reduce this gap.
- Performance Ceiling: The model seems to have reached a performance ceiling, where the validation accuracy has stabilized at around 75%. To further improve this, you might want to try techniques such as more aggressive data augmentation, model fine-tuning, or leveraging pre-trained models for transfer learning.